ABSTRACT

We present new spectroscopic data for twenty-six stars in the recently discovered Canes
Venatici I (CVnI) dwarf spheroidal galaxy. We use these data to investigate the recent claim of
the presence of two dynamically inconsistent stellar populations in this system (Ibata et al.
2006). We do not find evidence for kinematically distinct populations in our sample and we
are able to obtain a mass estimate for CVnI that is consistent with all available data, including
previously published data. We discuss possible differences between our sample and the earlier 
data set and study the general detectability of sub-populations in small kinematic samples. We 
conclude that in the absence of supporting observational evidence (for example, metallicity 
gradients), sub-populations in small kinematic samples (typically fewer than 100 stars) should 
be treated with extreme caution, as their detection depends on multiple parameters and rarely 
produces a signal at the $3\sigma$ confidence level. It is therefore essential to determine explicitly
the statistical significance of any suggested sub-population. 

Key words: dark matter – galaxies: individual (CVnI dSph) – galaxies: kinematics and
dynamics – Local Group – stellar dynamics



1 INTRODUCTION 

It is now widely accepted that the dwarf spheroidal (dSph) satellite 
galaxies of the Milky Way and Andromeda are the most dark mat- 
ter dominated stellar systems known in the Universe (e.g. Mateo 
1998). Over the past two decades, a significant amount of observa- 
tional work has focussed on quantifying both the amount of dark 
matter in these systems, and its spatial distribution (e.g. Gilmore 
et al. 2007; Walker et al. 2007). Although recent numerical sim- 
ulations have shown that many of the dSphs may not be immune 
to tidal disturbance by the Milky Way (e.g. Munoz, Majewski, & 
Johnston 2008; Lokas et al. 2008), their observed properties still 
require the presence of massive dark matter haloes which protect 
them against complete tidal disruption. The dSphs thus provide us 
with nearby laboratories in which to test dark matter theories. 

Given that dSphs occupy the low luminosity end of the galaxy 
luminosity function, their star formation histories provide useful insights
into the star formation process. Analyses of spatial variations
in colour-magnitude diagram morphology provided early evidence
of population gradients in a number of dSphs (e.g. Harbeck et al. 
2001). More recently, evidence of metallicity gradients has been 
found using spectroscopic estimates of [Fe/H] (e.g. Tolstoy et al. 
2004; Koch et al. 2006; Battaglia et al. 2006). In at least one case, 
that of the Sculptor dSph, the metal-rich and metal-poor popula- 
tions have significantly different spatial distributions and kinemat- 
ics (Tolstoy et al. 2004; Battaglia et al. 2008). Although little evi- 
dence of similar features has been found in other dSphs (e.g. Koch 
et al. 2006, 2007a,b), the presence of dynamically distinct stellar 
populations within dSphs, as well as the complex interplay between 
the dynamical, spatial and chemical properties of their stars, is of 
great interest as it has implications for star formation and galaxy 
evolution. 

It is, however, important to note that although the hierarchical 
build-up of structure in the standard $\Lambda$-Cold Dark Matter ($\Lambda$CDM)
paradigm implies that satellite galaxies contribute significantly to 
the stellar haloes of their hosts, detailed abundance studies of stars 
in the more luminous dSphs have demonstrated that their properties 
are significantly different from those of the Milky Way halo (e.g.
Shetrone et al. 2001; Helmi et al. 2006). Among the significant
differences between the halo and the dSphs, the more important 
chemical differences are in the alpha-elements (Unavane, Wyse & 
Gilmore 1996; Venn et al. 2004). The observed gradients in the 
heavy element distributions are reproduced by the models of super- 
nova feedback in dSphs developed by Marcolini et al. (2008). Thus, 
it appears that the primordial dwarf satellites, which were disrupted 
to form the Milky Way halo, had stellar populations distinct from 
those seen in the present-day dSphs (Robertson et al. 2005; Font et 
al. 2006). 

Given their high estimated mass-to-light ratios, the observed 
dSphs are usually identified with the large population of sub-haloes 
which are observed to surround Milky Way-sized haloes in cosmological
simulations assuming a standard $\Lambda$CDM universe. However,
it was noted early on that the number of dSphs around the
Milky Way was much lower than the expected number of satellite 
dark matter haloes (e.g. Moore et al. 1999). A number of possi- 
ble explanations for the apparent lack of Milky Way satellites have 
been presented in the literature (e.g. Stoehr et al. 2002; Diemand, 
Madau & Moore 2005; Moore et al. 2006; Strigari et al. 2007; Si- 
mon & Geha 2007; Bovill & Ricotti 2008). All these models are 
based on the reasonable postulate that out of the full population 
of substructures around the Milky Way, the observed dSphs are 
merely the particular subset which (for reasons of mass, orbit, for- 
mation epoch, re-ionisation, etc.) were able to capture gas, form 
stars and survive any subsequent tidal interactions with the Milky 
Way. 

In addition, the ratio between the predicted and observed num- 
bers of dwarf galaxies has decreased significantly in the past few 
years due to the discovery of nine new Milky Way dSph satel- 
lites (Willman et al. 2005; Zucker et al. 2006a,b; Belokurov et al. 
2006, 2007; Walsh, Jerjen, & Willman 2007) in the data from the 
Sloan Digital Sky Survey (SDSS; York et al. 2000). Since the SDSS 
covers only about one fifth of the sky, it is thus likely that the total 
number of satellites surrounding the Milky Way may be at least a 
factor of five larger than previously thought, although the extrapolation
from the SDSS survey to the whole sky requires careful analysis
(see e.g. Tollerud et al. 2008). In order to compare the properties
of the newly discovered satellites with those of sub-haloes
in cosmological simulations, as well as to confirm their nature as
true satellite galaxies of the Milky Way, as opposed to star clusters
or disrupted remnants, spectroscopic observations of their member
stars are essential, since they allow dynamical masses to be estimated
from the observed stellar kinematics. The extremely low luminosities of
these objects (in some cases as low as $10^3\,L_\odot$; Martin et al. 2008b)
present significant observational challenges: the kinematic data
sets are small, making it difficult to obtain statistically significant
results.

The Canes Venatici I (CVnI) dSph is the brightest of the newly
discovered population of very faint SDSS dSphs (Zucker et al.
2006a). Ibata et al. (2006) presented spectra for a sample of CVnI
member stars obtained using the DEIMOS spectrograph mounted
on the Keck telescope. They identified two kinematically distinct
stellar populations in this data set: an extended metal-poor population
with high velocity dispersion and a centrally concentrated
metal-rich population with a dispersion of almost zero. Their analysis
of the mass of CVnI suggested that the two populations might
not be in equilibrium as the mass profiles obtained based on the individual
populations were inconsistent with each other. However, a
subsequent study of CVnI by Simon & Geha (2007), using a larger
sample of Keck spectra, did not reproduce this bimodality.

An important outstanding question is whether the ultra-faint
dSphs represent the low-luminosity tail of the dSph population, or
are instead the brightest members of a population of hitherto un- 
known faint stellar systems, distinct from both dSphs and star clus- 
ters. The presence of multiple, distinct kinematic populations in a 
low-luminosity dSph would set it apart from the majority of low- 
luminosity star clusters. In addition, the presence of a spread in the 
stellar abundances would suggest an association with the brighter 
dwarf galaxies and would also be interesting in terms of its impli- 
cations for star formation. It is thus important to determine whether 
the sub-population identified by Ibata et al. (2006) in CVnI is real.
One goal of our study was to shed some light on this issue by using
spectra obtained with a different spectrograph from that used in the
previous two studies of CVnI. In addition, we wanted to investigate
the extent to which sub-populations can be reliably detected
in the very small kinematic data sets which are observable for the 
ultra-faint dSphs. 

In addition to their potential importance for probing the star 
formation histories of dSphs, kinematic substructures can be used 
to test another key feature of the hierarchical structure formation 
paradigm. The fact that dark matter clustering occurs on all scales 
means that the dSph satellites of the Milky Way are likely to be in 
the process of accreting their own population of smaller satellites. 
Although these substructures may not have been able to form their 
own stars, they may be able to acquire stars from their host dSph. 
They would then be detectable as localised populations with mean 
velocity and/or velocity dispersion distinct from that of the dSph. 
Populations with these properties have, in fact, been detected in 
the Ursa Minor and Sextans dSphs (Kleyna 2003; Walker 2006). 
Once a dSph halo begins to fall into the Milky Way, it will cease 
to accrete new satellites as any nearby substructures will rapidly 
be removed by the tidal field of the Milky Way and the high rela- 
tive velocities in the Milky Way halo will preclude the capture of 
new satellites. Due to the short internal dynamical timescales in 
dSphs (typically a few hundred Myr), any remaining internal sub- 
structures will subsequently be destroyed on timescales of at most 
a few Gyrs if dSph haloes are cusped, although they can survive 
much longer if their haloes are cored (Kleyna 2003). In the stan- 
dard cusped-halo picture, only those satellites which have been in- 
teracting with the Milky Way for less than a few internal dynamical 
times, either because they are currently passing the Milky Way for 
the first time as may be the case for the Leo I dSph (Mateo 2008) or 
the Magellanic Clouds (Kallivayalil et al. 2006; Besla et al. 2007; 
Piatek et al. 2008) or because their crossing times are larger (e.g. 
the Magellanic Clouds: van der Marel 2002), would be expected 
to exhibit localised kinematic substructure. If localised substruc- 
tures were found to be common in dSphs, this could be difficult 
to reconcile with a picture in which dSphs occupy cusped haloes. 
Given that the level of substructure above a given mass fraction 
is a function of halo mass (Gao 2004), the expected number of
sub-haloes per dSph requires further investigation by means of cosmological
simulations. However, the importance of comparing the
level of substructure in dSphs with the results of numerical simu- 
lations adds further motivation to our goal of establishing the level 
of confidence with which sub-populations can be detected in small 
data sets. 

The outline of the paper is as follows. In Section 2, we present
a new kinematic data set for stars in CVnI, based on spectra obtained
with the Gemini telescope, and calculate a mass estimate
for the galaxy from these data. In Section 3, we look for kinematic
sub-populations in our data, and compare our findings with those of
Ibata et al. (2006). Section 3.2 discusses the general detectability of
sub-populations in small kinematic data sets. Finally, in Section 4
we draw some general conclusions and suggest possible differences
between the two data sets for CVnI that we have compared.

Figure 1. SDSS (g - i, i) colour-magnitude diagram for stars in a field of
radius 15 arcmin centred on CVnI. Our likely CVnI members are indicated
as solid triangles. Two velocity outliers are shown as open circles.

Figure 2. Distribution of our GMOS target fields in CVnI. The data points
show the positions of stars satisfying our CMD selection cut. The slight
excess of stars in the region -10 < X < 10 and -5 < Y < 5 indicates the
location of the main body of CVnI. The ellipse shows the half-light radius
of the system, with semi-major axis 8.9 arcmin (Martin et al. 2008b).



2 CANES VENATICI I 
2.1 Data Reduction 

Twenty eight stars in the CVnI dSph were observed on 2007 
March 26 and 2007 April 7 and 8 using the GMOS-N spectrograph 
mounted on the Gemini North telescope. Our targets were chosen 
by cross-matching of GMOS-N pre-images (taken in the i-band) 
with existing SDSS photometry. As Figure 1 shows, all the selected 
stars lie in the red giant branch (RGB) region of the CVnI colour-magnitude
diagram (CMD). A total of three GMOS slit-masks were
observed, with the spectra centred on the Ca triplet region
(around 860 nm). Our masks covered three
distinct fields in CVnI. Figure 2 shows the locations of the fields 
relative to the spatial distribution of stars in CVnI. The masks were 
cut with slitlets of width 0.75 arcsec. 

The GMOS detector consists of three adjacent CCDs. As the 
dispersion axis of the slits is perpendicular to the spaces between 
the CCDs, the spectra contain gaps corresponding to the inter- 
CCD gaps. In order to achieve continuous wavelength coverage 
throughout the spectral region of interest, each mask was observed 
in two configurations with different central wavelengths (855 nm
and 860 nm). All observations were taken using the R831+_G5302
grating and CaT_G0309 filter, with 2×4 spectral and spatial binning,
respectively. The spectra thus obtained have a nominal resolution
of 3600. The three fields were observed for a total of 10,800 s,
9,000 s and 12,600 s, respectively, with the observations divided into
individual exposures of 1800 s to facilitate cosmic ray removal.

The raw data were reduced using the standard gemini reduction
package which is run within the Image Reduction and Analysis
Facility (IRAF)$^1$ environment. All data were first bias subtracted
and flat-field corrected. The individual spectral traces were identified
from flat field images (obtained using Quartz halogen continuum
lamp exposures). The wavelength calibration of the spectra
was performed using CuAr lamp exposures adjacent in time to the
science exposures as calibration frames. The typical r.m.s. uncertainty
in the wavelength calibration, obtained by fitting a polynomial
to the line positions in the CuAr spectra, was 0.01 Å, which
corresponds to a velocity error of ~0.4 km s$^{-1}$ at a wavelength of
860 nm. This wavelength solution was then applied to the reduced
science spectra. Sky subtraction was performed by using the sky
flux in the regions of the slit not dominated by light from the target
to estimate the sky spectrum. Finally, the object spectra were extracted
from the CCD images using a fifth order Chebyshev polynomial
fit. Figure 3 shows examples of a good quality spectrum (top
panel), a low quality spectrum (middle panel), and a typical quality
spectrum (bottom panel).
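
As a quick check of the arithmetic, a wavelength uncertainty can be converted to a velocity uncertainty with the non-relativistic Doppler relation, velocity error = c × (wavelength error / wavelength). The short sketch below is illustrative only; the function name is ours and is not part of the reduction pipeline.

```python
# Speed of light in km/s.
C_KMS = 299792.458


def wavelength_rms_to_velocity(rms_angstrom, wavelength_angstrom):
    """Velocity uncertainty (km/s) implied by a wavelength uncertainty."""
    return C_KMS * rms_angstrom / wavelength_angstrom


# 0.01 Angstrom r.m.s. at 860 nm (8600 Angstrom), the values quoted in the text.
dv = wavelength_rms_to_velocity(0.01, 8600.0)
print(f"{dv:.2f} km/s")
```

This evaluates to roughly 0.35 km/s, in line with the approximate figure quoted for the wavelength calibration.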

2.2 Velocities 

The velocities of the stars were calculated using the fxcor task 
in IRAF to cross-correlate the stellar spectral lines with the lines 
in a template Ca triplet spectrum. The synthetic template consisted 
of three Gaussian lines at the wavelengths of the Ca triplet lines, 
whose widths were chosen to match those typical of RGB stars. We 
first cross-correlated the individual science exposures as a prelimi- 
nary diagnostic of whether any spectra were obviously anomalous 
and should be excluded. As none of the spectra seemed to have serious
problems, all frames were used in the velocity calculations
and we combined all heliocentric-corrected exposures of the same
mask together in order to increase the signal-to-noise.

1 IRAF is distributed by the National Optical Astronomy Observatories,
which are operated by the Association of Universities for Research in Astronomy
Inc. (AURA), under cooperative agreement with the National Science
Foundation.

Figure 3. Three sample spectra for stars of magnitude i = 19.7 (top), i = 20.6
(middle) and i = 20.1 (bottom). The quality of the spectrum in the bottom
panel is typical of the majority of our stars.

The fxcor task returns estimated velocity uncertainties which 
are based on the Tonry-Davis Ratio for the fitted cross-correlation 
peak. These errors are often found not to be an accurate reflection 
of the true uncertainties (see e.g. Kleyna et al. 2002; Munoz et al. 
2005). In order to estimate the actual uncertainty in our velocity
determinations, we measured separately the velocities $v_1$ and $v_2$ for
the spectra with central wavelengths 855 nm and 860 nm, respectively.
We combined these estimates to obtain the mean velocity
for each star, $\bar{v} = 0.5(v_1 + v_2)$, and defined a $\chi^2$ statistic via

$$\chi^2 = \frac{(v_1 - \bar{v})^2}{(dv_1)^2} + \frac{(v_2 - \bar{v})^2}{(dv_2)^2}, \qquad (1)$$

where $dv_1$ and $dv_2$ are the formal errors returned by fxcor. We
then rescaled the velocity errors in our sample by a factor $f$ so that
the sum of equation 1 over all stars was $2N$, where $N$ is the size
of the velocity sample. Finally, using the rescaled errors, we calculated
$P(\chi^2)$ for each star, using the routine gammq from Numerical
Recipes (Press et al. 1991). The final velocities and errors are
given in Table 1. Following the error rescaling, only one star was
found to have an extremely low value of $P(\chi^2)$ ($< 10^{-4}$). As Table 1
shows, this star also has the largest velocity in the sample and a relatively
large estimated velocity error, possibly due to its low signal-to-noise
ratio, and we therefore excluded it from our final sample.
We also excluded one star which has a very different radial velocity,
$v_r = -39.1$ km s$^{-1}$, compared to the mean velocity of the rest of our
target stars ($25.8 \pm 0.3$ km s$^{-1}$; see Section 2.4). Figure 4 shows the
velocity histogram for our final sample consisting of 26 stars. We 
note that our sample includes 10 stars from the Ibata et al. (2006) 
and Martin et al. (2007) sample. Figure 5 is a histogram of
the differences between the velocities of these stars in the
two studies, in units of their $1\sigma$ measurement uncertainties. Thus,
we calculate $\Delta v/\sigma = (v_{\rm Keck} - v_{\rm GMOS})/\sqrt{dv_{\rm Keck}^2 + dv_{\rm GMOS}^2}$, after
applying a velocity shift of $-3.4$ km s$^{-1}$ to our estimates in order to
bring the medians of the two data sets together. The plot shows that
apart from the two outliers, at $8.6\sigma$ and $4.6\sigma$, with very different
velocities in the two sets, the differences are normally distributed. The
outliers are possibly stars in binary systems whose velocities
changed between the two observations. The Ibata et al. (2006)
data were taken in May 2006, i.e. around ten months earlier than
our data. The observed velocity differences of 8-10 km s$^{-1}$ over
this baseline are consistent with tight binary orbits.
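
The error-rescaling step described in this section can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the example measurements are hypothetical, the target sum per star (taken here as 2, i.e. a total of 2N, which is our reading of the text) is exposed as a parameter, and the number of degrees of freedom used for the tail probability is an assumption. For one and two degrees of freedom the incomplete gamma function computed by gammq reduces to closed forms, so only the standard library is needed.

```python
import math


def chi2_pair(v1, v2, dv1, dv2):
    """Chi-square of two velocity measurements about their unweighted mean."""
    vbar = 0.5 * (v1 + v2)
    return ((v1 - vbar) / dv1) ** 2 + ((v2 - vbar) / dv2) ** 2


def rescale_factor(pairs, target_per_star=2.0):
    """Factor f by which all errors are multiplied so that the summed
    chi-square equals target_per_star * N (ASSUMED target: 2N)."""
    total = sum(chi2_pair(*p) for p in pairs)
    return math.sqrt(total / (target_per_star * len(pairs)))


def p_chi2(chi2, dof=1):
    """Probability of exceeding chi2 by chance, Q(dof/2, chi2/2).
    The choice of dof is an ASSUMPTION; closed forms exist for dof = 1, 2."""
    if dof == 1:
        return math.erfc(math.sqrt(0.5 * chi2))
    if dof == 2:
        return math.exp(-0.5 * chi2)
    raise ValueError("only dof = 1 or 2 supported in this sketch")


# Hypothetical (v1, v2, dv1, dv2) measurements in km/s.
pairs = [(24.0, 26.0, 1.0, 1.0), (30.0, 31.0, 0.5, 0.5), (10.0, 18.0, 1.0, 2.0)]
f = rescale_factor(pairs)
for v1, v2, dv1, dv2 in pairs:
    chi2 = chi2_pair(v1, v2, f * dv1, f * dv2)
    print(f"chi2 = {chi2:.2f}, P = {p_chi2(chi2):.3f}")
```

Stars whose rescaled tail probability is extremely small would then be flagged as having inconsistent repeat measurements.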

2.3 Metallicities 

It is now well-established that the line strength of the near-infrared
Ca triplet lines in the spectrum of an RGB star can be used to estimate
the [Fe/H] of the star (e.g. Armandroff & Zinn 1988; Armandroff
1991; Carrera et al. 2007; Bosler et al. 2007). We note that
this method may be less reliable when extrapolating
below metallicities of [Fe/H] $\sim -2.2$, where globular cluster calibrators
are missing (Koch et al. 2008), although comparisons of high- and
low-resolution data by Battaglia et al. (2008) have shown that CaT-based
estimates may be correct down to [Fe/H] $\sim -3$. In practice,
we normalized the spectra using a seventh order Legendre polynomial,
fitted each of the triplet lines using a Penny function (see
Cole et al. 2004), and integrated the profile over the standard band
passes of Armandroff & Zinn (1988). The final [Fe/H] metallicities,
on the scale of Carretta & Gratton (1997), were calculated using the
calibration of Rutledge et al. (1997a; 1997b), namely

$$\mathrm{[Fe/H]} = -2.66 + 0.42\,[\Sigma W + 0.64\,(V - V_{\rm HB})], \qquad (2)$$

where we parameterised the line strength of the Ca triplet as

$$\Sigma W = 0.5\,w_1 + w_2 + 0.6\,w_3, \qquad (3)$$

where $w_1$, $w_2$ and $w_3$ are the equivalent widths of the individual lines. In Eq. 2,
$V$ is the V-band magnitude of the star, and $V_{\rm HB}$ is the magnitude
of the horizontal branch of the system. For the latter, we used a
value of $V_{\rm HB} = 22.4$, obtained by visual inspection of the (V-I, V)
colour-magnitude diagram of CVnI. We note that this is very similar
to the value of $V_{\rm HB} = 22.5$ used by Martin et al. (2008a). The
uncertainty of ~0.1 magnitudes in $V_{\rm HB}$ gives rise to a negligible
additional uncertainty in our [Fe/H] estimates. The random errors
on the [Fe/H] metallicities were calculated using the formalism of
Cayrel (1988) for the errors on the single line widths and are based
on the spectral signal-to-noise ratio. These were then propagated
through the calibration equations, accounting for photometric errors.
The final metallicity estimates are given in Table 1. Figure 6
shows the distribution of velocity versus [Fe/H] for our CVnI sample.
The error-weighted mean [Fe/H] is $-1.9 \pm 0.02$ compared to
the value of $-2.09 \pm 0.02$ found by Simon & Geha (2007). We note
that all previous studies of CVnI have found a significant spread in
[Fe/H], of order 0.5 dex (Ibata et al. 2006; Simon & Geha 2007;
Kirby 2008), and our value thus lies within the range of previous
estimates. As the figure shows, there appear to be no obvious correlations
between velocity and [Fe/H] in our sample.
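
The calibration in equations 2 and 3 is simple enough to write out directly. The sketch below only restates those two equations; the function names and the example line widths and magnitude are hypothetical values chosen for illustration, not measurements from Table 1.

```python
def summed_width(w1, w2, w3):
    """Weighted Ca triplet line strength of equation 3 (widths in Angstrom)."""
    return 0.5 * w1 + w2 + 0.6 * w3


def cat_metallicity(w1, w2, w3, v_mag, v_hb=22.4):
    """[Fe/H] from equation 2; v_hb defaults to the value adopted in the text."""
    reduced_width = summed_width(w1, w2, w3) + 0.64 * (v_mag - v_hb)
    return -2.66 + 0.42 * reduced_width


# Hypothetical star: line widths of 1.0, 2.0 and 1.5 Angstrom, V = 20.4.
feh = cat_metallicity(1.0, 2.0, 1.5, v_mag=20.4)
print(f"[Fe/H] = {feh:.2f}")
```

Because the horizontal-branch term enters with a coefficient of 0.42 × 0.64 ≈ 0.27, a 0.1 mag shift in the horizontal-branch magnitude changes [Fe/H] by only about 0.03 dex.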

2.4 Mass Calculation 

In order to estimate the mass of CVnI, we calculate the velocity dispersion
of the system using the new velocity set that we obtained in
the previous section. We use a maximum likelihood method (e.g.
Kleyna et al. 2004) to calculate the velocity dispersion and the
mean velocity of our data. We apply an iterative $3\sigma$ cut in velocity;
however, we note that this did not remove any stars (i.e. it converged
after a single iteration). Based on the CMD in Figure 1, we
do not expect significant foreground contamination in our velocity
sample, and we therefore use all 26 of our stars when estimating
the dispersion. We find a dispersion of $\sigma = 7.9^{+\ldots}_{-\ldots}$ km s$^{-1}$ and a
mean velocity of $v = 25.8 \pm 0.3$ km s$^{-1}$. The latter is somewhat
smaller than the value of $v = 30.9 \pm 0.6$ km s$^{-1}$ found by Simon &
Geha (2007). Ibata et al. (2006) found dispersions of 13.9 km s$^{-1}$
and 0.5 km s$^{-1}$ for the two populations they identified. As we discuss
below, we do not find evidence of multiple populations in our
data and we therefore quote only a single value for the dispersion.

Table 1. Summary of properties of our CVnI data. Columns give: (1) Right ascension; (2) Declination; (3,4) V and I magnitudes, calculated from SDSS photometry
using the transformations of Lupton (2005: http://www.sdss.org/dr4/algorithms/sdssUBVRITransform.html#Lupton). Lupton derived these equations
by matching photometry from SDSS Data Release 4 to Stetson's published photometry; (5,6) SDSS g, i magnitudes; (7,8) radial velocity and error, in km s$^{-1}$;
(9,10) combined equivalent width of Ca triplet lines, with error, obtained using equation 3; (11,12) estimated metallicity [Fe/H], with error, obtained using
equation 2. Note that one star ($v = -39.1$ km s$^{-1}$) is a clear outlier from the mean velocity of $v = 25.8 \pm 0.3$ km s$^{-1}$. A second outlier has $v = 44$ km s$^{-1}$ and
also a relatively large error of $dv = 3.2$ km s$^{-1}$ for our sample. We therefore exclude these two stars from our analysis.
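
A maximum likelihood estimate of the mean velocity and intrinsic dispersion, allowing for individual measurement errors, can be sketched as below. This is a generic illustration of the approach under stated assumptions, not the authors' implementation: each star is assumed to be drawn from a Gaussian whose variance is the sum of the intrinsic dispersion squared and its own error squared, the dispersion is found by a simple grid search, and all names and example values are hypothetical.

```python
import math


def log_likelihood(sigma, v, dv):
    """Log-likelihood for intrinsic dispersion sigma, maximised over the mean.
    Returns (log-likelihood, best-fitting mean)."""
    var = [sigma ** 2 + e ** 2 for e in dv]
    weights = [1.0 / x for x in var]
    mean = sum(w * vi for w, vi in zip(weights, v)) / sum(weights)
    ll = -0.5 * sum(math.log(2.0 * math.pi * x) + (vi - mean) ** 2 / x
                    for x, vi in zip(var, v))
    return ll, mean


def fit_dispersion(v, dv, sigma_max=30.0, step=0.01):
    """Grid search for the dispersion that maximises the likelihood."""
    best = None
    for i in range(1, int(round(sigma_max / step)) + 1):
        sigma = i * step
        ll, mean = log_likelihood(sigma, v, dv)
        if best is None or ll > best[0]:
            best = (ll, sigma, mean)
    return best[1], best[2]


# Hypothetical velocities (km/s) with 3 km/s errors.
velocities = [20.0, 30.0, 20.0, 30.0]
errors = [3.0, 3.0, 3.0, 3.0]
sigma_hat, mean_hat = fit_dispersion(velocities, errors)
print(f"sigma = {sigma_hat:.2f} km/s, mean = {mean_hat:.2f} km/s")
```

In this toy case the scatter about the mean is 5 km/s, so subtracting the 3 km/s errors in quadrature leaves an intrinsic dispersion of 4 km/s.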

Following Ibata et al. (2006) we use our dispersion measurement
to constrain the mass of CVnI. In order to proceed we need
to parameterise the spatial distribution of our data. We assume that 
our tracer population is drawn from a Plummer distribution and 
we find the scalelength for which the likelihood of the positions 
of our tracer data set is maximised. Based on the positions of our 
tracer stars only, we find a Plummer scalelength of a = 4.62 ar- 
cmin, which is smaller than the value of 8.5 ± 0.5 arcmin found by 
Zucker et al. (2006a) and 8.9 ± 0.4 arcmin found by Martin et al. 
(2008b) for the full stellar distribution. The mass is then calculated 
using the isotropic Jeans equation (Binney & Tremaine 1987, eq. 
4.56), under the additional assumption of spherical symmetry. 

We find a mass of $4.4^{+\ldots}_{-\ldots} \times 10^7\,M_\odot$ within the volume probed
by our data (i.e. out to a radius of 11 arcmin). The mass-to-light ratio
is calculated assuming a luminosity of $L = (2.3 \pm 0.3) \times 10^5\,L_\odot$
(Martin et al. 2008b). Assuming symmetric errors we find
$M/L = 192 \pm 76\,M_\odot/L_\odot$. If we take the value of the scalelength
reported by Zucker et al. (2006a) to be that of our tracers, we obtain
a mass of $3.3^{+\ldots}_{-\ldots} \times 10^7\,M_\odot$. We note that both these estimates are
larger than the mass of $M = (2.7 \pm 0.4) \times 10^7\,M_\odot$ reported by Simon
& Geha (2007) using their larger data set. The difference is probably
due to our assumption of a constant velocity dispersion profile,
whereas those authors implicitly assumed that mass follows light.
Ibata et al. (2006) obtained two very different
mass estimates using the distinct populations which they identified
in their data. An important point to keep in mind while dealing with
small data sets is that the Jeans equations remain valid for density-weighted
averages of the spatial distributions, velocity dispersion
profiles and velocity anisotropy profiles of multiple tracer populations
(Strigari et al. 2007). Thus, it is legitimate to use a data set
which may contain multiple sub-populations when estimating the
mass of the system. As long as all sub-populations are in dynamical
equilibrium, this estimate will be more reliable than the noisier
estimates based on the smaller, individual populations.
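
For orientation, the mass estimate can be reproduced approximately with a closed-form expression. For an isotropic, spherical system with a constant velocity dispersion, the Jeans equation gives M(r) = -(r sigma^2 / G) dln(nu)/dln(r); for a Plummer tracer density, nu proportional to (1 + r^2/a^2)^(-5/2), this becomes M(r) = 5 sigma^2 r^3 / [G (a^2 + r^2)]. The sketch below evaluates this expression. The text does not spell out this formula, so it should be read as our interpretation of the stated assumptions; the distance of 220 kpc is likewise an assumed value that is not given in this section, and the function names are ours.

```python
import math

# Gravitational constant in pc (km/s)^2 / Msun.
G_PC = 4.30091e-3


def arcmin_to_pc(theta_arcmin, distance_pc):
    """Projected length (pc) subtended by an angle at a given distance."""
    return distance_pc * math.radians(theta_arcmin / 60.0)


def plummer_isothermal_mass(r_pc, a_pc, sigma_kms):
    """Enclosed mass (Msun) from the isotropic Jeans equation with constant
    dispersion and a Plummer tracer profile of scalelength a."""
    return 5.0 * sigma_kms ** 2 * r_pc ** 3 / (G_PC * (a_pc ** 2 + r_pc ** 2))


DISTANCE_PC = 220.0e3  # ASSUMED distance to CVnI; not stated in this section

r = arcmin_to_pc(11.0, DISTANCE_PC)   # outermost radius probed by the data
a = arcmin_to_pc(4.62, DISTANCE_PC)   # Plummer scalelength of the tracers
mass = plummer_isothermal_mass(r, a, 7.9)
print(f"M(<11 arcmin) ~ {mass:.2e} Msun")
```

With these inputs the result is of order 4 × 10^7 solar masses, and substituting the larger scalelength of 8.5 arcmin lowers it, in the same sense as the two values quoted in the text.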
Figure 4. Velocity histogram for our data set of 26 likely members of CVnI.
Two obvious velocity outliers (at 44 km s$^{-1}$ and $-39.1$ km s$^{-1}$) have been
excluded from the figure.

Figure 5. Histogram of the normalised differences in estimated velocity for
the ten stars observed both by us and Ibata et al. (2006). Differences have
been normalised by the combined error from the two estimates (see text for
details). Apart from the two significant outliers, the distribution is close to
the overplotted Gaussian.



Figure 6. Line of sight velocity versus [Fe/H] for our sample of likely CVnI 
members. The two obvious velocity outliers have been excluded. 



3 SUB-POPULATIONS 
3.1 Canes Venatici I 

As we noted above, Ibata et al. (2006) identified two kinemati- 
cally distinct populations in CVnI. Given the potential importance 
of sub-populations in dSphs discussed in the introduction, we now 
investigate whether our data exhibit any evidence of multiple populations.
Although Fig. 6 shows a wide scatter in the abundances that
might be due to an extended star formation period, no clear signa- 
ture of distinct sub-populations is seen. In order to confirm this vi- 
sual impression more quantitatively, we fitted our velocity distribu- 
tion with multiple Gaussians and tested the significance of the fits 
using Monte Carlo realisations of our data. Our approach, which 
is essentially a likelihood ratio test, is similar to the KMM test 
(Ashman, Bird, & Zepf 1994) which is designed to detect multiple 
Gaussian populations with different means and dispersions within 
a single data set, although unlike the KMM test we do not include a 
determination of which sub-population the individual stars belong 
to. 

The first step of this process was to fit a single Gaussian to our
velocity data. We then repeated the fit for a two-Gaussian model in
which a fraction $f$ of the data belonged to a population with mean
$\bar{v}_1$ and dispersion $\sigma_1$, and the remaining data had mean $\bar{v}_2$ and dispersion
$\sigma_2$. As expected, the two-Gaussian model yielded higher
likelihoods. In order to determine whether this was only due to the
increased number of fitting parameters or was a real detection, we
tested the significance of the results with artificial data. To do this,
we generated 1000 data sets of 26 stars drawn from a single Gaussian
and calculated the improvement of the fit with a two-Gaussian
model. The distribution of probability ratios $\Delta P$ is shown in Figure 7.
The value we obtained for our CVnI data is shown as the
single dot in the upper panel of the figure. As this point coincides with
the peak obtained by fitting two Gaussians to artificial data consisting
of a single population, we conclude that we do not see evidence
for a second population in our data. The bottom panel of
Figure 7 shows the equivalent test for the Ibata et al. (2006) data
(where we have taken the data for their 26 stars with S/N > 15
as listed in Table 2 of Martin et al. 2007). In this case the improvement
is larger than would be expected to arise by chance in a
single-Gaussian data set. We note that a similar result is obtained
when the two data sets are combined. Therefore we conclude that
there is evidence of a second population, containing 40 per cent of
the total number of stars, in the Ibata et al. (2006) data set, at almost
the $3\sigma$ confidence level. We find that the dispersions of these populations
are 0.6 km s$^{-1}$ and 13.6 km s$^{-1}$, respectively. Although these
values are similar to those found by Ibata et al. (2006), we note that
the populations we have identified may be different to those in that
paper, as in that case the separation of the populations included an
explicit velocity cut.
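
The likelihood ratio test with Monte Carlo calibration described above can be sketched in a few lines. This is an illustrative sketch under simplifying assumptions and not the authors' code: the two-Gaussian fit uses a basic expectation-maximisation (EM) loop rather than whatever optimiser was actually used, measurement errors are ignored, a floor on the dispersion (a hypothetical 0.5 km/s) prevents a component from collapsing onto a single star, and all function names are ours.

```python
import math
import random


def loglike_single(v):
    """Maximised log-likelihood of a single Gaussian (analytic ML solution)."""
    n = len(v)
    mean = sum(v) / n
    var = sum((x - mean) ** 2 for x in v) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def _log_terms(x, frac, mu, sig):
    """Log of each weighted Gaussian component of the mixture at x."""
    return [math.log(frac[k]) - 0.5 * ((x - mu[k]) / sig[k]) ** 2
            - math.log(sig[k] * math.sqrt(2.0 * math.pi)) for k in (0, 1)]


def loglike_double(v, n_iter=100, sig_floor=0.5):
    """Approximate maximised log-likelihood of a two-Gaussian mixture (EM)."""
    n = len(v)
    mean = sum(v) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in v) / n)
    frac, mu, sig = [0.5, 0.5], [mean, mean], [0.5 * std, 1.5 * std]
    for _ in range(n_iter):
        # E-step: probability that each star belongs to component 0.
        resp = []
        for x in v:
            a, b = _log_terms(x, frac, mu, sig)
            resp.append(1.0 / (1.0 + math.exp(min(b - a, 700.0))))
        # M-step: update fraction, mean and dispersion of each component.
        for k, r in enumerate((resp, [1.0 - p for p in resp])):
            w = max(sum(r), 1e-9)
            mu[k] = sum(p * x for p, x in zip(r, v)) / w
            var = sum(p * (x - mu[k]) ** 2 for p, x in zip(r, v)) / w
            sig[k] = max(math.sqrt(var), sig_floor)
            frac[k] = min(max(w / n, 1e-6), 1.0 - 1e-6)
    ll = 0.0
    for x in v:
        a, b = _log_terms(x, frac, mu, sig)
        m = max(a, b)
        ll += m + math.log(math.exp(a - m) + math.exp(b - m))
    # A single Gaussian is a special case of the mixture, so the true
    # maximum cannot be lower than the single-Gaussian value.
    return max(ll, loglike_single(v))


def delta_p(v):
    """Improvement in log-likelihood of the two-Gaussian over the single fit."""
    return loglike_double(v) - loglike_single(v)


def null_delta_p(n_stars, sigma, n_real, rng):
    """Delta P for n_real artificial single-Gaussian data sets."""
    return [delta_p([rng.gauss(0.0, sigma) for _ in range(n_stars)])
            for _ in range(n_real)]


def false_alarm_fraction(dp_obs, null):
    """Fraction of single-population realisations with Delta P >= dp_obs."""
    return sum(1 for d in null if d >= dp_obs) / len(null)


if __name__ == "__main__":
    rng = random.Random(42)
    null = null_delta_p(n_stars=26, sigma=8.0, n_real=50, rng=rng)
    sample = [rng.gauss(0.0, 8.0) for _ in range(26)]  # stand-in for real data
    print(false_alarm_fraction(delta_p(sample), null))
```

An observed improvement is only interesting if very few single-population realisations match or exceed it; in practice many more realisations than the 50 used here (the text uses 1000) are needed to resolve the tail of the distribution.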

3.2 Detectability 

Having considered our CVnI data set, in this section we investigate
the more general question of when sub-populations can be reliably 
detected in small kinematic data sets. A limitation of our study is 
that, for simplicity, we are working entirely with Gaussian popu- 
lations. However, our results will be conservative in the sense that 
mixtures of non-Gaussian populations are likely to be more difficult 
to disentangle. 

The detectability of a sub-population depends on i) the total 
number of stars in the data set; ii) the fraction of stars in the sub- 
population; iii) the difference in velocity dispersion between the 
populations; iv) the observational errors on the velocities. We in- 
vestigate the importance of each of these in turn. 

The total number of stars is crucial for the detection of
multiple populations. Figure 8 shows three tests done with data
sets of $N = 120$, 60 and 30 stars. In each panel a comparison is
made between data sets that have either a single population or two
populations containing equal numbers ($N/2$) of stars. Motivated by
the case of CVnI, we consider data sets having two sub-populations
with $\sigma = 13$ km s$^{-1}$ and $\sigma = 1$ km s$^{-1}$. The two
populations are thus clearly distinct, which isolates the effect of the
sample size on the result. In Figure 8, we plot the improvement in
probability obtained using a two-Gaussian fit to the single and double
populations as dashed and solid histograms, respectively. We determine
the $1\sigma$ and $3\sigma$ range of the distribution of values obtained
from the single population (control) sample, and define a $1\sigma$
($3\sigma$) detection of a sub-population to be one in which $\Delta P$
is larger than the $1\sigma$ ($3\sigma$) limits of the control sample.
Although the difference between the dispersions is large, as we reduce
the sample size, the significance of the detection of multiple
populations decreases, as would be expected. Nevertheless, even for
$N = 30$ stars, in 34.8 per cent (97.5 per cent) of cases, the
sub-population is detected at the $3\sigma$ ($1\sigma$) confidence level.
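The test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the fitting code used for the figures: the systemic velocity is taken as known and set to zero, the maximum-likelihood fits are replaced by a coarse grid search, the control sample is given the dispersion of the hot population, and the $1\sigma$ and $3\sigma$ limits are taken to be the one-sided 0.8413 and 0.99865 quantiles of the control distribution. All function names and grids are hypothetical.

```python
import math
import random

SIGMAS = [float(s) for s in range(1, 21)]  # trial dispersions in km/s (assumed grid)
FRACS = [0.1, 0.3, 0.5, 0.7, 0.9]          # trial mixture fractions (assumed grid)

def mock_velocities(n_stars, sig_hot, sig_cold, frac_cold, dv, rng):
    """Draw zero-mean velocities from one or two Gaussians, then add errors."""
    n_cold = round(frac_cold * n_stars)
    sigmas = [sig_cold] * n_cold + [sig_hot] * (n_stars - n_cold)
    return [rng.gauss(0.0, s) + rng.gauss(0.0, dv) for s in sigmas]

def delta_p(vels, dv):
    """Return Delta P = log P2 - log P1 from a brute-force grid search."""
    # Per-star density for each trial dispersion, broadened by the error dv.
    dens = []
    for s in SIGMAS:
        var = s * s + dv * dv
        norm = math.sqrt(2.0 * math.pi * var)
        dens.append([math.exp(-0.5 * v * v / var) / norm + 1e-300 for v in vels])
    log_p1 = max(sum(math.log(p) for p in d) for d in dens)
    log_p2 = -math.inf
    for i, d1 in enumerate(dens):
        for d2 in dens[: i + 1]:  # includes d2 == d1, so the models are nested
            for f in FRACS:
                ll = sum(math.log(f * a + (1.0 - f) * b) for a, b in zip(d1, d2))
                log_p2 = max(log_p2, ll)
    return log_p2 - log_p1

def upper_limit(control, q):
    """q-quantile of the control Delta P values (simple order statistic)."""
    ordered = sorted(control)
    return ordered[min(len(ordered) - 1, int(q * len(ordered)))]

def detection_fraction(n_stars, n_real, q, rng, sig_hot=13.0, sig_cold=1.0,
                       frac_cold=0.5, dv=2.0):
    """Fraction of two-population samples whose Delta P exceeds the control limit."""
    control = [delta_p(mock_velocities(n_stars, sig_hot, sig_cold, 0.0, dv, rng), dv)
               for _ in range(n_real)]
    double = [delta_p(mock_velocities(n_stars, sig_hot, sig_cold, frac_cold, dv, rng), dv)
              for _ in range(n_real)]
    limit = upper_limit(control, q)
    return sum(d > limit for d in double) / n_real

if __name__ == "__main__":
    rng = random.Random(1)
    # One-sided 1-sigma detection rate for N = 30, using 20 realisations per case.
    print(detection_fraction(30, 20, 0.8413, rng))
```

Because the grid for the two-Gaussian fit contains every single-Gaussian model, $\Delta P$ is never negative in this sketch. A realistic rerun of the experiment would use 1000 realisations per case and a proper optimiser in place of the grid.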

The next parameter that we study is the fractional size of the
sub-populations. This is important for the CVnI populations, since
it is possible that a cold population in the centre could have been
missed in our sample if it contained a smaller number of stars. Our
preliminary tests showed that a cold population could not be detected
even in a large sample if it only made up $\sim 0.1$ of the total
number of stars. Figure 9 shows results for cold populations with
fractional sizes $f = 0.3$, $f = 0.5$ and $f = 0.7$. In this test, the
dispersions of the sub-populations are 13 km s$^{-1}$ and 1 km s$^{-1}$
and the velocity error is 2 km s$^{-1}$. For 120 stars, a $3\sigma$
detection was made for all the samples with a cold population of
fractional size $f = 0.5$ and $f = 0.7$. We found (see Figure 9) that
when the cold population has a smaller fractional size in the sample,
i.e. $f = 0.3$, it was detected in 75.1 per cent (99.8 per cent) of
cases at the $3\sigma$ ($1\sigma$) level. It is thus easier to detect a
sub-population if its dispersion is larger than that of the main
population, rather than a cold sub-population.

Figure 7. Distribution of likelihood ratios $\Delta P = \log P_2 - \log P_1$
between a single-Gaussian fit ($P_1$) and a two-Gaussian fit ($P_2$) to 1000
Monte Carlo realisations of 26 stars drawn from a single Gaussian
distribution. The single dots indicate the values obtained for our GMOS-N
data (upper panel) and the Keck data of Ibata et al. (2006) (bottom panel).
Although there is no evidence of multiple populations in our data, a
sub-population containing a fraction of 40 per cent of the stars is detected
in the Ibata et al. (2006) data. See text for a detailed discussion.

Figure 8. Histograms illustrating the detectability of sub-populations as a
function of sample size. The two histograms in each panel show the
distribution of likelihood differences $\Delta P = \log P_2 - \log P_1$
between a single-Gaussian fit ($P_1$) and a two-Gaussian fit ($P_2$) for
data sets having either a single population (dashed histogram) or two
equal-sized populations (solid histogram). The velocity error is constant:
$dv = 2$ km s$^{-1}$. The total number of stars is 120 in the left panel, 60
in the middle panel and 30 in the right panel. As expected, the histograms
merge together as the number of stars decreases and the detection of a
sub-population becomes more difficult.

We next consider the impact of velocity errors on our ability to
detect multiple populations with similar velocity dispersions. The
sub-populations in this case have $\sigma_1 = 7$ km s$^{-1}$ and
$\sigma_2 = 4$ km s$^{-1}$. As Figure 10 shows, decreasing the errors from
$dv = 2$ km s$^{-1}$ to $dv = 1$ km s$^{-1}$ gives rise to a small change in
the distribution of $\Delta P$ values. However, this does not lead to a
significant increase in the probability of detecting the multiple
populations. We therefore conclude that velocity errors at the
1-2 km s$^{-1}$ level (similar to the CVnI data) do not affect our ability
to identify sub-populations.
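A short worked example may help to show why errors of this size have so little effect. It assumes Gaussian velocity errors that add in quadrature to the intrinsic dispersion, which is an assumption of this sketch; the symbol $\sigma_{\rm obs}$ is introduced here for illustration only.

```latex
\begin{align*}
\sigma_{\rm obs} &= \sqrt{\sigma^2 + dv^2} \\
dv = 2~\mathrm{km\,s^{-1}}: \quad
  \sqrt{7^2 + 2^2} &= \sqrt{53} \approx 7.28, &
  \sqrt{4^2 + 2^2} &= \sqrt{20} \approx 4.47 \\
dv = 1~\mathrm{km\,s^{-1}}: \quad
  \sqrt{7^2 + 1^2} &= \sqrt{50} \approx 7.07, &
  \sqrt{4^2 + 1^2} &= \sqrt{17} \approx 4.12
\end{align*}
```

Under this assumption the ratio of the observed dispersions changes only from about 1.63 to about 1.72 when the error is halved, compared with an intrinsic ratio of 1.75, so the two populations become only marginally easier to separate.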

Finally, to see the effect of the difference between the velocity
dispersions of the populations we investigate samples in which
the main population has a dispersion of 15 km s$^{-1}$ while the cold
sub-populations have dispersions ranging from $\sigma = 1$ km s$^{-1}$ to
14 km s$^{-1}$. We consider two sample sizes, with a total number of
either 120 or 60 stars. We find that even for a relatively large
sample of stars ($N = 120$), a $3\sigma$ detection is possible for all the
samples only when $\sigma_1/(\sigma_1 - \sigma_2) < 1.1$, i.e. when the
velocity dispersions of the individual populations are
$\sigma_1 = 15$ km s$^{-1}$ and $\sigma_2 = 1$ km s$^{-1}$. A $1\sigma$
level detection is possible for all 1000 samples for
$\sigma_1/(\sigma_1 - \sigma_2) < 1.3$, in which case the sub-populations'
dispersions are $\sigma_1 = 15$ km s$^{-1}$, $\sigma_2 = 3$ km s$^{-1}$.
However, populations with $\sigma_1/(\sigma_1 - \sigma_2) < 1.4$ and
$\sigma_1/(\sigma_1 - \sigma_2) < 1.9$ can be detected at $3\sigma$ and
$1\sigma$ levels for 68 per cent of the 1000 samples. For a smaller
sample ($N = 60$), a $3\sigma$ detection for all samples is not possible for
even the largest ratios of $\sigma_1/\sigma_2$. In this case $3\sigma$ and
$1\sigma$ detections for 68 per cent of the samples require
$\sigma_1/(\sigma_1 - \sigma_2) < 1.3$ and
$\sigma_1/(\sigma_1 - \sigma_2) < 1.7$ respectively. We note that the
claimed CVnI populations in Ibata et al. (2006) have an even more extreme
dispersion difference $\sigma_1/(\sigma_1 - \sigma_2) = 1.04$. A Monte Carlo
experiment with 26 stars and this dispersion ratio shows that in this case
populations can be identified with $3\sigma$ confidence in 90.5 per cent of
the samples. Table 2 summarises our results for the full range of
dispersion ratios we have considered.
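The thresholds above are quoted in terms of $\sigma_1/(\sigma_1 - \sigma_2)$, which is not an intuitive quantity. The following small helper, a sketch with hypothetical function names, converts between that ratio and the cold dispersion for a given main dispersion.

```python
def dispersion_ratio(sigma_main, sigma_cold):
    """Return sigma_1 / (sigma_1 - sigma_2) for a main and a cold population."""
    if sigma_cold >= sigma_main:
        raise ValueError("the cold dispersion must be smaller than the main one")
    return sigma_main / (sigma_main - sigma_cold)

def cold_dispersion(sigma_main, ratio):
    """Invert the ratio: sigma_2 = sigma_1 * (1 - 1 / ratio)."""
    return sigma_main * (1.0 - 1.0 / ratio)

if __name__ == "__main__":
    # Ratios for the two cases quoted in the text (sigma_1 = 15 km/s).
    for sigma_cold in (1.0, 3.0):
        print(sigma_cold, round(dispersion_ratio(15.0, sigma_cold), 3))
    # Cold dispersions corresponding to the quoted 68 per cent thresholds.
    for limit in (1.4, 1.9, 1.3, 1.7):
        print(limit, round(cold_dispersion(15.0, limit), 2))
```

For $\sigma_1 = 15$ km s$^{-1}$, cold dispersions of 1 and 3 km s$^{-1}$ give ratios of about 1.07 and 1.25, consistent with the limits of 1.1 and 1.3 quoted for $N = 120$. The 68 per cent thresholds of 1.4 and 1.9 correspond to cold dispersions of roughly 4.3 and 7.1 km s$^{-1}$.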



Figure 9. Histograms illustrating the effect of the relative sizes of the
populations. As in previous figures, the histograms show the distribution of
likelihood differences $\Delta P = \log P_2 - \log P_1$ between a
single-Gaussian fit ($P_1$) and a two-Gaussian fit ($P_2$) to the data sets.
The histograms are plotted for a single population (dot-dashed line), and
for double populations including a cold sub-population consisting of 30 per
cent (dotted line), 50 per cent (dashed line) and 70 per cent (solid line)
of the total sample of 120 stars. See text for a discussion.



4 CONCLUSIONS



In this paper, we have presented a new data set of velocities and
metallicities for the Canes Venatici I (CVnI) dSph based on spectra
taken with the GMOS-North spectrograph. A maximum likelihood
fit to the velocity distribution yields a mean velocity of
v = 25.8+0.3 km s -1 and a dispersion of <x = 1 -9 + _\\ km s -1 . Assum- 
ing a constant, isotropic velocity dispersion and a Plummer profile 
for the mass distribution, we find a mass of 4.4+] * x 10 7 M o in the 
volume where our tracer stars are located. Although this value is
larger than the value of $(2.7 \pm 0.4) \times 10^7\,M_\odot$ calculated by
Simon & Geha (2007), this is most likely due to the assumptions made for
our models and the distribution of our particular subsets of stars.

One of the original aims of our study was to investigate the
claimed multiple stellar populations in CVnI. As we discussed
above, the two previous studies by Ibata et al. (2006) and Simon &
Geha (2007) did not agree on the existence of a cold sub-population
in CVnI. The two populations found in the former study were puzzling
as they led to two different mass estimates. The authors suggested
that this might indicate that the system had recently accreted
a younger population and was not yet in equilibrium.

Table 2. Confidence limits for the detection of sub-populations with
different kinematics. Columns are: (1) ratio of the main velocity dispersion
$\sigma_1$ to the difference between the populations,
$\sigma_1 - \sigma_2$; (2) total number of stars in the data set; (3) number
of two-population samples for which $\Delta P$ is greater than the $1\sigma$
upper limit of $\Delta P$ obtained from single-population samples;
(4) number of two-population samples for which $\Delta P$ is greater than
the $3\sigma$ upper limit of $\Delta P$ obtained from single-population
samples. We compare populations of 60 and 120 stars containing two
sub-populations. In each case $\sigma_1 = 15$ km s$^{-1}$ while $\sigma_2$
lies in the range 1 km s$^{-1}$ to 14 km s$^{-1}$. Each dispersion ratio has
been tested for 1000 data sets.

Figure 10. Histograms illustrating the effect of velocity errors on the
detection of sub-populations with similar velocity dispersions. The total
number of stars is 120 and the sub-populations contain equal numbers of
stars. In each case, the double-populated sample with
$\sigma_1 = 7$ km s$^{-1}$ and $\sigma_2 = 4$ km s$^{-1}$ is shown by the
solid-line histogram and compared to a single population system with
$\sigma = 7$ km s$^{-1}$, shown as a dotted histogram. In the top panel, the
velocity errors for both histograms are $dv = 2$ km s$^{-1}$. As this error
is relatively large compared to the difference in the dispersion, in the
bottom panel we repeat the same experiment with $dv = 1$ km s$^{-1}$.

In this paper we looked for evidence of multiple populations 
in our data under the assumption that each population was Gaus- 
sian. Based on this analysis, we concluded that there was no reason 
to suspect the presence of a second population in our data. We also 
applied our analysis to the Ibata et al. (2006) data where we found
evidence of a statistically significant sub-population with a
dispersion of $\sigma = 0.6$ km s$^{-1}$ (compared to
$\sigma = 13.6$ km s$^{-1}$ for the main population).

Our analysis suggests that there is a qualitative difference be- 
tween our data and those of Ibata et al. (2006). Although further 
data would be necessary to resolve this issue, we note that the spa- 
tial distributions of these two data sets are different, which could 
potentially account for the differences in the detected populations. 
However, our central field is centred close to the blue/young star 
population which Martin et al. (2008a) find in their photometry 
from the Large Binocular Telescope, and which they identify with 
the cold population of Ibata et al. (2006). The exact fraction of stars 
in each population found by Martin et al. (2008a) is currently un- 
clear, however, and so it is possible that we have not picked up any 
stars associated with the cold population. 

We have also carried out a study of the detectability of sub- 
populations in small kinematic data sets. Under the assumption of 
Gaussian populations, we studied the effects of four parameters. 
We obtained confidence limits for the detection of sub-populations
in samples with different numbers of stars, different population
ratios and velocity dispersions. We found that reasonable errors on
the observed velocities do not affect the detectability of the
sub-populations. For a given sample size, our ability to detect two
populations increased as the ratio of their dispersions
$\sigma_1/\sigma_2$ increased. However, even for large $\sigma_1/\sigma_2$
and equal population size, a sample of 30 stars yielded a $3\sigma$
detection in only $\sim 35$ per cent of
cases. As expected, for larger sample sizes, this detection rate was 
significantly higher. We also showed that a cold population needs 
to constitute a larger fraction of the total sample than is required 
to detect a hot sub-population. This suggests that the robust de- 
tection of the sub-populations associated with any surviving sub- 
haloes within a dSph would require samples of many hundreds of 
velocities. In this case, localised substructures could be detected by 
windowing the data, provided that a window whose spatial size co- 
incided with plausible sub-halo scales would contain a sample of 
at least 100 stars. As such data sets are now becoming available for 
many of the larger dSphs, this test may soon be feasible. We note 
that the claim of multiple global populations in Sculptor (Tolstoy et 
al. 2004) was based on a large data set and is therefore still robust. 

Finally, we note that all our significance tests were based on 
the assumption of Gaussian populations, which was the case for all 
our Monte Carlo samples. However, for real data, the true distribu- 
tions will not be known, and are not necessarily well-approximated 
by Gaussians. It is therefore difficult in a real case to assign a robust 
statistical significance to a particular detection of a sub-population. 

As we have shown, for small data sets, many Monte Carlo re- 
alisations do not yield significant detections of the sub-populations. 
In the absence of a robust estimate of the confidence level of a par- 
ticular detection, or additional, independent evidence of the pres- 
ence of multiple populations, we conclude that one should exercise 
great caution in decomposing data sets of fewer than 100 stars into 
multiple populations. 